Reusing clinical data can improve personalised care, support earlier diagnosis and deepen understanding of disease. Turning that into routine practice is hard. A Belgian glioblastoma initiative, TumorScope, set out to build a secure data environment linking MRI and PET-CT, tissue samples, genomic profiles and electronic health records. Over five years, the team met technical, legal and organisational barriers that slowed access, linkage and reuse. Challenges included fragmented data flows, uneven pseudonymisation and cautious interpretations of data protection and medical secrecy. The European Health Data Space Regulation (EHDS) aims to bring more coherence to secondary use, yet open questions remain on national roll-out, technical standards and intellectual property. Experience shows that compliance alone does not deliver reuse at scale without clear governance, defined roles and stable investment in infrastructure and skills. 

 

Fragmented Pipelines and Pseudonymisation Limits 

Multimodal projects generate data that sit in different systems, formats and teams. Imaging exported from hospital PACS must have identifying DICOM tags removed or encoded and must receive project-specific identifiers, with linkage keys stored securely by the hospital. Even when metadata are cleaned, MRI can allow facial reconstruction, so preprocessing such as skull stripping is needed to reduce re-identification risk. Burned-in pixel text on images or captured reports requires automated detection and masking. 
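The header step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the header is modelled as a plain dictionary keyed by DICOM attribute keywords, the list of identifying keywords is a small illustrative subset, and the LinkageTable class, the deidentify_header function and the "GBM" project prefix are hypothetical names. It covers metadata only and does nothing about facial features or burned-in pixel text.

```python
import secrets

# Assumption: an illustrative subset of identifying DICOM attribute keywords.
# A real project would follow its approved de-identification profile.
IDENTIFYING_KEYWORDS = frozenset({
    "PatientName", "PatientBirthDate", "PatientAddress",
    "ReferringPhysicianName", "InstitutionName", "AccessionNumber",
})


class LinkageTable:
    """Hypothetical hospital-held mapping from real patient IDs to project codes."""

    def __init__(self, project: str) -> None:
        self.project = project
        self._codes: dict[str, str] = {}

    def code_for(self, patient_id: str) -> str:
        """Return the existing project code for a patient, or issue a new random one."""
        if patient_id not in self._codes:
            code = f"{self.project}-{secrets.token_hex(4)}"
            while code in self._codes.values():  # avoid reusing a code
                code = f"{self.project}-{secrets.token_hex(4)}"
            self._codes[patient_id] = code
        return self._codes[patient_id]


def deidentify_header(header: dict[str, str], table: LinkageTable) -> dict[str, str]:
    """Return a copy with identifying attributes dropped and PatientID replaced."""
    cleaned = {k: v for k, v in header.items() if k not in IDENTIFYING_KEYWORDS}
    cleaned["PatientID"] = table.code_for(header["PatientID"])
    return cleaned


# Made-up headers for two scans of the same patient.
table = LinkageTable("GBM")
scan_1 = {"PatientID": "H-001", "PatientName": "DOE^JANE",
          "PatientBirthDate": "19700101", "Modality": "MR"}
scan_2 = {"PatientID": "H-001", "PatientName": "DOE^JANE", "Modality": "PT"}
clean_1 = deidentify_header(scan_1, table)
clean_2 = deidentify_header(scan_2, table)
print(clean_1)
print(clean_2)
```

Because the codes are random, the only way back to the patient is the linkage table, which stays with the hospital. Both scans receive the same project code, so they remain linkable inside the project.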

 

Genomic files such as FASTQ, BAM and VCF are highly identifying and rarely anonymisable. Access is restricted to trained, authorised users, with laboratory identifiers replaced by neutral codes. Tissue workflows add traceability via label photos and time-stamped microscopy images, which can link back to a patient within a complete dataset. Clinical records combine structured content, including classifications like SNOMED CT, with free text that is harder to standardise. 

 

Because each stream is de-identified and governed on its own, linking at patient level depends on a shared pseudonymisation key or a harmonised method across departments. Without this, even approved protocols struggle to combine imaging, pathology, genomics and clinical variables. In practice, the lack of a central registry for MRI and genetic data split the work into parallel tracks: mutation prediction proceeded using public datasets while imaging analysis progressed separately, with harmonisation only a future aim. Public repositories were also limited to single biopsies per patient, which does not reflect glioma heterogeneity or enable spatially resolved genotype–phenotype mapping. 
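Why a shared key matters can be shown with a small sketch. It assumes a keyed hash (HMAC-SHA256) as the pseudonymisation method, which is one common option and not necessarily what the initiative used. The keys, file names and patient numbers are made up, and pseudonymise, pseudonymise_stream and link are hypothetical helper names.

```python
import hashlib
import hmac


def pseudonymise(patient_id: str, key: bytes) -> str:
    """Keyed hash of a patient ID; the same key and ID always give the same pseudonym."""
    return hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()[:12]


def pseudonymise_stream(records: dict[str, str], key: bytes) -> dict[str, str]:
    """Replace the patient IDs used as keys of one data stream with pseudonyms."""
    return {pseudonymise(pid, key): value for pid, value in records.items()}


def link(imaging: dict[str, str], genomics: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Join two pseudonymised streams on the pseudonyms they have in common."""
    common = imaging.keys() & genomics.keys()
    return {p: (imaging[p], genomics[p]) for p in common}


# Made-up records keyed by hospital patient number.
imaging_raw = {"H-001": "mri_0001.nii", "H-002": "mri_0002.nii"}
genomics_raw = {"H-001": "sample_a.vcf", "H-002": "sample_b.vcf"}

# Case 1: both departments use one shared project key.
shared_key = b"hypothetical-shared-project-key"
linked_shared = link(pseudonymise_stream(imaging_raw, shared_key),
                     pseudonymise_stream(genomics_raw, shared_key))

# Case 2: each department pseudonymises with its own key.
linked_separate = link(pseudonymise_stream(imaging_raw, b"radiology-only-key"),
                       pseudonymise_stream(genomics_raw, b"genetics-only-key"))

print(len(linked_shared), "patients linked with a shared key")
print(len(linked_separate), "patients linked with separate keys")
```

With separate keys the same patient ends up under unrelated pseudonyms in each stream, so the join finds nothing even though both datasets are complete and individually well protected.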

 

Operational Gaps and Trust Constraints 

Perceptions at the front end can hide constraints at the back end. Clinicians often phrase requests using the interface they see daily, while engineers must query databases managed by different custodians with varied rules. Ambiguities matter. A request about medication may mean every administration, a specific time window or dose patterns, each needing different extraction logic. Free-text fields increase cleaning effort and slow delivery. 
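The medication example can be made concrete. The sketch below uses made-up administration records and hypothetical function names to show that the three readings of "medication data" lead to three different queries with three different result shapes. Real extraction would run against the custodian's database and not an in-memory list.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Administration:
    patient: str
    drug: str
    given_at: datetime
    dose_mg: float


def all_administrations(rows: list[Administration], drug: str) -> list[Administration]:
    """Reading 1: every recorded administration of the drug."""
    return [r for r in rows if r.drug == drug]


def in_window(rows: list[Administration], drug: str,
              start: datetime, end: datetime) -> list[Administration]:
    """Reading 2: administrations with start <= given_at < end."""
    return [r for r in rows if r.drug == drug and start <= r.given_at < end]


def daily_dose(rows: list[Administration], drug: str) -> dict[tuple[str, date], float]:
    """Reading 3: a dose pattern, here the total dose per patient per calendar day."""
    totals: dict[tuple[str, date], float] = defaultdict(float)
    for r in rows:
        if r.drug == drug:
            totals[(r.patient, r.given_at.date())] += r.dose_mg
    return dict(totals)


# Made-up records for illustration only.
rows = [
    Administration("P1", "dexamethasone", datetime(2024, 3, 1, 8, 0), 4.0),
    Administration("P1", "dexamethasone", datetime(2024, 3, 1, 20, 0), 4.0),
    Administration("P1", "dexamethasone", datetime(2024, 3, 2, 8, 0), 2.0),
    Administration("P1", "paracetamol", datetime(2024, 3, 1, 9, 0), 1000.0),
]

everything = all_administrations(rows, "dexamethasone")
second_day = in_window(rows, "dexamethasone",
                       datetime(2024, 3, 2), datetime(2024, 3, 3))
pattern = daily_dose(rows, "dexamethasone")
print(len(everything), len(second_day), pattern)
```

Agreeing on which of these the clinician means before extraction starts avoids delivering a correct answer to the wrong question.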

 


 

Trust and control shape decisions. Custodians may be reluctant to move pseudonymised datasets to external compute environments due to oversight and misuse concerns. Keeping data within the institution shifts responsibility to internal teams that must maintain secure environments, high-performance computing and specialist staff. Medical Research Ethics Committees can authorise projects that later prove infeasible when access, interoperability or infrastructure limits surface during implementation. Financial pressure compounds these issues because secure processing, storage and expert personnel carry recurring costs. 

 

Even when access is possible, coordination is essential. A reproducible pseudonymisation process, or a designated data manager who can extract identifiable elements, link them and apply consistent pseudonyms, prevents drift across projects. Without this, teams underestimate preparation and linkage work. Clear mapping of available datasets, metadata, access conditions and reuse constraints would make planning more realistic. Catalogues that surface these details upfront reduce repeated discoveries and help align expectations. 
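A catalogue of this kind does not need to be elaborate to be useful. The sketch below assumes a hypothetical CatalogueEntry record and a plan_protocol helper, with made-up datasets and conditions. It shows how recording custodian, access conditions and reuse constraints per dataset lets a team see gaps and obligations before a protocol is submitted.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """Hypothetical catalogue record for one dataset."""
    name: str
    custodian: str
    modality: str
    access_conditions: list[str] = field(default_factory=list)
    reuse_constraints: list[str] = field(default_factory=list)


def plan_protocol(entries: list[CatalogueEntry],
                  needed: list[str]) -> tuple[list[str], dict[str, list[str]]]:
    """Return the modalities with no catalogued dataset and the conditions on the rest."""
    by_modality = {e.modality: e for e in entries}
    missing = sorted(m for m in needed if m not in by_modality)
    conditions = {
        m: by_modality[m].access_conditions + by_modality[m].reuse_constraints
        for m in needed if m in by_modality
    }
    return missing, conditions


# Made-up catalogue content for illustration only.
catalogue = [
    CatalogueEntry("Brain MRI archive", "Radiology", "imaging",
                   access_conditions=["on-site compute only"]),
    CatalogueEntry("Tumour slide scans", "Pathology", "pathology",
                   access_conditions=["data manager extracts"],
                   reuse_constraints=["sponsor consent for earlier trial data"]),
]

missing, conditions = plan_protocol(catalogue, ["imaging", "pathology", "genomics"])
print("No catalogued source for:", missing)
print("Conditions to plan for:", conditions)
```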

 

Law, Ethics and the EHDS in Practice 

Fear of non-compliance can drive restrictive readings of legal frameworks. The General Data Protection Regulation (GDPR) promotes risk-based safeguards, with pseudonymisation as one option, yet debate continues about what counts as effective pseudonymisation and acceptable residual risk. Data Protection Impact Assessments (DPIAs) should record mitigations and proportionality, but they may be treated as hurdles where all risk must be eliminated before work proceeds. Imaging and genomics are often labelled inherently identifiable, leading to lengthy deliberation on thresholds and safeguards. 

 

Beyond data protection, national rules on medical secrecy impose criminally sanctioned confidentiality. Questions arise over whether GDPR legal bases allow well-regulated research disclosures or whether medical secrecy remains a separate duty that limits roles for data engineers without a therapeutic relationship. Contractual limits from earlier sponsored studies and intellectual property terms can also restrict reuse of data stored in clinical systems. Cybersecurity duties under the NIS2 Directive widen accountability for research institutions, encouraging caution in collaborative projects. 

 

The EHDS sets out a route for secondary use. Data users would apply to national Health Data Access Bodies, which maintain dataset catalogues. Participation by data holders is mandatory: they must describe and share defined categories of electronic health data under standard terms. The Regulation prefers anonymised data where feasible and allows pseudonymised data when justified, with keys held by the access body and strict bans on re-identification. Practical uncertainties remain. Intellectual property and trade secret provisions permit protective conditions or even refusals where serious risks are identified, which could limit availability. Technical modalities will be set through implementing acts, leaving questions on standards, infrastructure and safeguards. Divergent national approaches may lead to inconsistent access unless guidance and resourcing are clear. 

 

Experience from an interdisciplinary glioblastoma initiative shows that successful secondary use requires more than legal clearance. Progress depends on shared governance between clinical, legal, ethical and technical teams, reproducible pseudonymisation or coordinated data management and sustained investment in secure environments, computing and skilled personnel. Where work stalled, causes were often organisational as well as legal, including unclear responsibilities, limited data literacy, infrastructure gaps and financial pressure. The EHDS offers a framework that could streamline access, but its effect will hinge on implementation, clarity around intellectual property and workable technical specifications. A proportionate, risk-based approach documented through robust DPIAs, supported by transparent catalogues of datasets and conditions, can reduce friction and enable ethical, high-quality research that benefits patients and society. 

 

Source: Journal of Healthcare Informatics Research 



References:

Van Scharen A, Cruyt K, Colon J et al. (2025) Unlocking Health Data for Research: Legal, Technical, and Organisational Lessons from a Belgian Interdisciplinary Case Study. J Healthc Inform Res: In Press. 


